Information Extraction Using Metadata and Solving Polysemy Problems

Author

  • Bharathi
Abstract

Data mining is the exploration and evaluation of large quantities of data to discover substantial, novel, useful, and readily understandable information. Determining the knowledge contained in a document is therefore a necessary task in data mining. In general, there are three approaches to metadata: stylistic, machine learning, and knowledge-based. A problem arises when the document being mined contains polysemous words, which lead to irrelevant extraction and increased processing time. Polysemy refers to the coexistence of many possible meanings for a word or phrase. To extract exact information, issues such as polysemy must be resolved. This work uses knowledge-based metadata to extract information with the Domain-based Information Extraction (DIE) technique. It aims to resolve polysemy, which can increase the accuracy of information extraction and reduce processing time. By applying this method to a large number of engineering domains, covering fields such as computer science, biomedicine, nanotechnology, and physics, this work shows that the information extraction is efficient for day-to-day applications, with reduced processing time.

Keywords: Data mining, Information extraction, Metadata, Polysemy, Domain-based extraction.
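The abstract does not give the details of DIE, so the following is only a minimal sketch of the general idea it describes: using a document's domain metadata together with a knowledge base to pick one sense for a polysemous word. The names SENSE_KB, Document, and resolve_senses, and all knowledge-base entries, are hypothetical and chosen for illustration; they are not taken from the paper.

```python
from dataclasses import dataclass

# Hypothetical knowledge base: term -> {domain: sense}.
# The entries are illustrative only and do not come from the paper.
SENSE_KB = {
    "cell": {
        "biomedical": "basic structural unit of an organism",
        "computer science": "single unit of a table or memory grid",
        "physics": "device that converts chemical energy to electricity",
    },
    "virus": {
        "biomedical": "infectious agent that replicates inside living cells",
        "computer science": "self-replicating malicious program",
    },
}


@dataclass
class Document:
    text: str
    domain: str  # metadata field, assumed to be supplied with the document


def resolve_senses(doc):
    """Return {term: sense} for each known polysemous term in the document.

    The sense is selected by the document's domain metadata. Terms with no
    entry for that domain are left unresolved instead of being guessed.
    """
    resolved = {}
    for token in doc.text.lower().split():
        term = token.strip(".,;:()")
        senses = SENSE_KB.get(term)
        if senses and doc.domain in senses:
            resolved[term] = senses[doc.domain]
    return resolved


doc = Document(text="The virus infects a host cell.", domain="biomedical")
senses = resolve_senses(doc)
```

Because the domain restricts the lookup to a single candidate sense per term, the extractor in this sketch never has to compare competing senses, which is one plausible reading of how resolving polysemy up front could reduce both irrelevant matches and processing time.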

Similar resources

Automatic Biomedical Term Polysemy Detection

Polysemy is the capacity for a word to have multiple meanings. Polysemy detection is a first step for Word Sense Induction (WSI), which allows finding different meanings for a term. Polysemy detection is also important for information extraction (IE) systems. In addition, polysemy detection is important for building/enriching terminologies and ontologies. In this paper, we present a nov...

Unsupervised Metadata Extraction in Scientific Digital Libraries Using A-Priori Domain-Specific Knowledge

Information extraction from unstructured sources is a crucial step in the semantic annotation of content. The challenge is in supporting a high-quality automatic approach (or at least semi-automatic) in order to sustain the scalability of the semantic-enabled services of the future. Unsupervised information extraction encompasses a number of underlying research problems, such as natural langua...

Towards Large-Scale Unsupervised Relation Extraction from the Web

The Web brings an open-ended set of semantic relations. Discovering the significant types is very challenging. Unsupervised algorithms have been developed to extract relations from a corpus without knowing the relation types in advance, but most of them rely on tagging arguments of predefined types. One recently reported system is able to jointly extract relations and their argument semantic cl...

A New Framework for Unsupervised Semantic Discovery

This paper presents a new framework for the unsupervised discovery of semantic information, using a divide-and-conquer approach to take advantage of contextual regularities and to avoid problems of polysemy and sublanguages. Multiple sets of documents are formed and analyzed to create multiple sets of frames. The overall procedure is wholly unsupervised and domain independent. The end result wi...

Data and Methods for the Production of National Population Estimates: An Overview and Analysis of Available Metadata

Thomas Spoorenberg. Translated by: Elham Fathi, Statistical Center of Iran. Abstract: Official population estimates can be produced using a variety of data sources and methods. These range from the direct extraction of information from continuously updated population registers to procedures for updating the status of a population enumerated previously in a periodic census. Additional sources and ...

Journal title:

Volume   Issue 

Pages  -

Publication date: 2013